Minimum Entropy Set Cover Problem for Lossy Data Compression

Authors

  • Marek Śmieja
  • Jacek Tabor
Abstract

The classical minimum entropy set cover problem consists in finding the most likely assignment between a set of observations and a given set of their types. The solution is described by a partition of the data space that minimizes the entropy of the distribution of types. The problem finds natural applications in machine learning, clustering, and data classification. In this paper we show that it is closely related to lossy data compression. In particular, we prove that the minimum entropy set cover is a special case of a specific generalized entropy coding. We establish the relation between the solutions of these two problems. Moreover, we propose a simple greedy algorithm that approximates the entropy of our lossy data compression within an additive term of log₂ e. The proof is based on a recent result obtained for minimum entropy set cover and on our partition reduction theorem for lossy data compression.
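The abstract does not spell out the greedy algorithm, so the following is only a minimal sketch of the standard greedy heuristic for minimum entropy set cover, not necessarily the authors' procedure. It assumes a finite universe with the uniform measure on its elements and a cover given as a list of Python sets; the names `greedy_min_entropy_cover` and `partition_entropy` are hypothetical and chosen for illustration.

```python
from collections import Counter
from math import log2


def greedy_min_entropy_cover(universe, cover):
    """Assign every element of `universe` to one set of `cover`, greedily.

    Assumption: `cover` is a list of sets whose union contains `universe`.
    Returns a dict mapping each element to the index of its assigned set.
    """
    uncovered = set(universe)
    assignment = {}
    while uncovered:
        # Pick the set that contains the most still-unassigned elements.
        best = max(range(len(cover)), key=lambda i: len(uncovered & cover[i]))
        gained = uncovered & cover[best]
        if not gained:
            raise ValueError("cover does not contain every element")
        for x in gained:
            assignment[x] = best
        uncovered -= gained
    return assignment


def partition_entropy(assignment):
    """Shannon entropy (in bits) of the induced distribution over chosen sets.

    Assumption: every element carries equal probability mass.
    """
    n = len(assignment)
    counts = Counter(assignment.values())
    return -sum((c / n) * log2(c / n) for c in counts.values())
```

For instance, calling `greedy_min_entropy_cover(range(6), [{0, 1, 2, 3}, {3, 4}, {4, 5}, {0, 5}])` first takes the four-element set and then the set `{4, 5}`, so the resulting partition has blocks of sizes 4 and 2. Assigning each element to exactly one set is what turns the cover into a partition, and the entropy is computed on that partition.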


Similar resources

Partition Reduction for Lossy Data Compression Problem

We consider the computational aspects of the lossy data compression problem, where the compression error is determined by a cover of the data space. We propose an algorithm which reduces the number of partitions needed to find the entropy with respect to the compression error. In particular, we show that, in the case of a finite cover, the entropy is attained on some partition. We give an algorithmic...


Entropy Approximation in Lossy Source Coding Problem

In this paper, we investigate a lossy source coding problem, where an upper limit on the permitted distortion is defined for every dataset element. It can be seen as an alternative approach to rate distortion theory where a bound on the allowed average error is specified. In order to find the entropy, which gives a statistical length of source code compatible with a fixed distortion bound, a co...


GFWX: Good, Fast Wavelet Codec, ICT Tech Report ICT-TR-01-2016

Wavelet image compression is a popular paradigm for lossy and lossless image coding, and the wavelet transform, quantization, and entropy encoding steps are well studied. Efficient implementation is straightforward for the first two steps using e.g. lifting and uniform scalar deadzone quantization, but entropy encoding is typically carried out using complex context modeling and arithmetic codin...


A Survey of Various Data Compression Techniques

This paper is a survey of various methods of data compression. When the computer age came about in the 1940s, storage space became an issue. Data compression was the answer to that problem. The compression process takes an original data set and reduces its size by taking out unnecessary data. There are two main types of compression, lossy and lossless. This paper will deal exclusively with los...


of PACS browser with the rest of the network Cluster Controller

Despite over a decade of research and development, medical image compression has not yet been widely implemented on clinical picture archiving and communication systems (PACS). We have developed a prototype interface which incorporates both lossless and lossy compression into a browsing system that enables the efficient use of network and storage resources. Such a system allows a user to quick...




Publication date: 2012